Evaluating the Effectiveness of Prompts and Responses in Generative Artificial Intelligence Systems

Abstract

Our paper proposes a comprehensive framework to evaluate the effectiveness of prompts and the corresponding responses generated by Generative Artificial Intelligence (GenAI) systems. To do so, our evaluation framework incorporates both objective metrics (accuracy, speed, relevancy, and format) and subjective metrics (coherence, tone, clarity, verbosity, and user satisfaction). A sample evaluation is performed on prompts sent to the Gemini and ChatGPT GenAI models. Additionally, our evaluation framework employs various feedback mechanisms, such as surveys, expert interviews, and reinforcement learning from human feedback (RLHF), to iteratively enhance the performance and reliability of GenAI models. By providing a holistic approach to evaluating and improving prompt-response effectiveness, our evaluation framework contributes to the development of more credible and user-friendly GenAI systems.
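The abstract does not spell out how the objective and subjective metrics are combined into a single effectiveness score. As a rough illustration only, the Python sketch below shows one plausible way such a rubric could be encoded: each metric is normalized to [0, 1] and aggregated with a weighted sum. The metric names follow the abstract, but the normalization, weights, and aggregation rule are hypothetical assumptions, not the paper's published method.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptResponseScores:
    """Per-response metric scores, each normalized to [0, 1].

    Metric names mirror the paper's framework; everything else
    (weights, weighted-sum aggregation) is an illustrative assumption.
    """
    # Objective metrics
    accuracy: float
    speed: float
    relevancy: float
    format_adherence: float
    # Subjective metrics (e.g., gathered via surveys or expert ratings)
    coherence: float
    tone: float
    clarity: float
    verbosity: float
    user_satisfaction: float


# Hypothetical weights: objective and subjective groups each total 0.5.
WEIGHTS = {
    "accuracy": 0.125, "speed": 0.125,
    "relevancy": 0.125, "format_adherence": 0.125,
    "coherence": 0.1, "tone": 0.1, "clarity": 0.1,
    "verbosity": 0.1, "user_satisfaction": 0.1,
}


def aggregate_score(scores: PromptResponseScores) -> float:
    """Weighted sum over all metric fields; higher is better."""
    return sum(WEIGHTS[f.name] * getattr(scores, f.name)
               for f in fields(scores))


if __name__ == "__main__":
    sample = PromptResponseScores(
        accuracy=0.9, speed=0.8, relevancy=0.95, format_adherence=1.0,
        coherence=0.85, tone=0.9, clarity=0.8, verbosity=0.7,
        user_satisfaction=0.75,
    )
    print(f"Aggregate effectiveness: {aggregate_score(sample):.3f}")
```

In practice the subjective fields would be populated from the feedback mechanisms the abstract describes (surveys, expert interviews, RLHF signals), while the objective fields can be measured automatically per prompt-response pair.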

Type
Conference Proceedings
Publication
37th International Conference on Computer Applications in Industry and Engineering (CAINE), October 21-22, 2024, San Diego, CA, USA.
Ruida Zeng
Computer Scientist

My interests include artificial intelligence, computer security, decentralized finance & blockchains, and applied cryptography.